ABSTRACT

We study and formulate arbitrage in display advertising.
Real-Time Bidding (RTB) mimics stock spot exchanges and
utilises computers to algorithmically buy display ads per
impression via a real-time auction. Despite the new automation,
the ad markets are still informationally inefficient
due to the heavily fragmented marketplaces. Two display
impressions with similar or identical effectiveness (e.g., measured
by conversion or click-through rates for a targeted
audience) may sell for quite different prices at different market
segments or pricing schemes. In this paper, we propose
a novel data mining paradigm called Statistical Arbitrage
Mining (SAM) focusing on mining and exploiting
price discrepancies between two pricing schemes. In essence,
our SAMer is a meta-bidder that hedges advertisers’ risk
between CPA (cost per action)-based campaigns and CPM
(cost per mille impressions)-based ad inventories; it statistically
assesses the potential profit and cost for an incoming
CPM bid request against a portfolio of CPA campaigns
based on the estimated conversion rate, bid landscape and
other statistics learned from historical data. In SAM, (i)
functional optimisation is utilised to seek for optimal bidding
to maximise the expected arbitrage net profit, and (ii)
a portfolio-based risk management solution is leveraged to
reallocate bid volume and budget across the set of campaigns
to make a risk and return trade-off. We propose to jointly
optimise both components in an EM fashion with high efficiency
to help the meta-bidder successfully catch the transient
statistical arbitrage opportunities in RTB. Both the
offline experiments on a real-world large-scale dataset and
online A/B tests on a commercial platform demonstrate the
effectiveness of our proposed solution in exploiting arbitrage
in various model settings and market environments.

Keywords 

Statistical Arbitrage, Real-Time Bidding, Display Ads 

1. INTRODUCTION 

“Half the money I spend on advertising is wasted; the trouble
is I don’t know which half.”

John Wanamaker (July 11, 1838 - December 12, 1922) 


KDD’15, August 10-13, 2015, Sydney, NSW, Australia. 

© 2015 ACM. ISBN 978-1-4503-3664-2/15/08 ...$15.00. 

DOI: http://dx.doi.org/10.1145/2783258.2783269 


A popular quotation from John Wanamaker, a pioneer in
advertising and department stores, illustrates how difficult it
was to quantify the response and performance in advertising
a hundred years ago. Over the last twenty years, the advancement
of the World Wide Web has fundamentally changed
this by providing an effective feedback mechanism to measure
the response through observing users’ search queries,
navigation patterns, clicks, conversions etc. Recently,
Real-Time Bidding (RTB) has emerged as a frontier for Internet
advertising [27, 17]. It mimics stock spot exchanges
and utilises computers to programmatically buy display ads
in real-time and per impression via an auction mechanism
between buyers (advertisers) and sellers (publishers) [32].

Such automation not only improves the efficiency and scale
of the buying process across the many available inventories,
but, most importantly, encourages performance driven advertising
based on targeted clicks, conversions etc., by using
real-time audience data. As a result, ad impressions become
more and more commoditised in the sense that the effectiveness
(quality) of an ad impression no longer relies on where it is
bought or whom it belongs to, but depends on how
much it will benefit the campaign target (e.g., underlying
Web users’ satisfaction and their direct responses).¹

According to the Efficient Market Hypothesis (EMH) in
finance, in a perfectly “efficient” market, security (such as
stock) prices should fully reflect all available information at
any time [13]. As such, no arbitrage opportunity exists, i.e.,
one can neither buy securities which are worth more than
the selling price, nor sell securities worth less than the selling
price without making a riskier investment [18]. However, due
to the heavily fragmented, non-transparent ad marketplaces
and the existence of various ad types (e.g., sponsored search,
display ads, affiliate networks) and pricing schemes (e.g.,
cost per mille impressions (CPM), cost per click (CPC), cost
per action (CPA)), the ad markets are not informationally
efficient. In other words, two display opportunities with
similar or identical targeted audiences and visit frequency
may sell for quite different prices. While exploiting such
price discrepancies is still debatable in the advertising field,
the following four arbitrage situations exist:

I Inter-exchange arbitrage. Multiple ad exchanges
exist. As the supply and demand vary across exchanges
for the same user types or targeting rules, there exist
intermediary agencies that act as a buyer with a low bid
in exchange A and as a seller with a high reserve price
in exchange B in order to make profits [2].


1 Our discussion is limited to performance driven ads and
direct responses such as clicks and conversions only, whereas
for the purpose of branding, the quality of publishers still
plays an important role in defining the ad inventory quality.




II Guaranteed delivery and spot market arbitrage.
Some demand-side platforms (DSPs) offer advertisers
contracts with guaranteed delivery [3] while buying
ad inventories over an RTB exchange with non-guaranteed
spot prices [25]. Conversely, some ad agencies
buy inventories in advance in bulk for fixed “preferential
rates” from private marketplaces, and then charge
clients for their campaigns at spot prices.

III Publisher volume I/O arbitrage. A publisher can
purchase traffic to her Web page and subsequently make
more from ad revenue than the initial inbound click
cost. An extreme case is a homepage purely dedicated
to hosting ads: the Million Dollar Homepage.

IV Pricing scheme arbitrage. In RTB, different counterparties
prefer different pricing schemes in order to reduce
their risk of deficit [19]. CPM is commonly used
for RTB auctions and preferred by publishers because it
is likely to generate stable income from the site volume.
By contrast, advertisers focusing on performance are
likely to follow CPA and CPC pricing schemes as they
are directly related to return on investment (ROI) [10].
As such, if the CPM cost to acquire a user conversion
is less than the CPA payoff for the conversion, an intermediate
agent can earn a positive profit.
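The condition in item IV can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and all numbers are hypothetical and not taken from the paper. It converts a CPM price into an effective cost per conversion and compares it with the CPA payoff.

```python
def ecpa_from_cpm(cpm: float, cvr: float) -> float:
    """Effective cost per conversion when buying impressions at a CPM price.

    cpm: cost of 1000 impressions; cvr: conversions per impression.
    """
    if cvr <= 0:
        raise ValueError("cvr must be positive")
    return (cpm / 1000.0) / cvr


def profit_per_mille(cpm: float, cvr: float, cpa_payoff: float) -> float:
    """Expected net profit from buying 1000 impressions under a CPA contract."""
    return 1000.0 * cvr * cpa_payoff - cpm


if __name__ == "__main__":
    # Hypothetical numbers: CPM of 2.0, CVR of 0.1%, CPA payoff of 5.0.
    print(ecpa_from_cpm(2.0, 0.001))          # cost to acquire one conversion
    print(profit_per_mille(2.0, 0.001, 5.0))  # positive, so arbitrage exists
```

With these made-up numbers a conversion costs about 2.0 to acquire and pays 5.0, so each thousand impressions yields an expected profit of about 3.0.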

Scientifically, this is of great interest as it presents a new
type of data mining problem, which demands a principled
mathematical formulation and a novel computational solution
to mine and exploit arbitrage opportunities in real-time display
advertising. Commercially and socially, principled ad
arbitrage algorithms would not only make the business
smoother and less risky (e.g., III & IV), but also make the ad
markets more transparent and informationally more efficient
(e.g., I, II & IV) by connecting otherwise segmented markets
to correct the misallocation of risks and prices, and eventually
reach an “arbitrage free” equilibrium.

In this paper, we formulate Statistical Arbitrage Mining
(SAM) and present a solution in the context of display advertising.
We focus on modelling discrepancies between CPA-based
campaigns and CPM-based ad inventories (IV above),
while the arbitrage models for the remaining cases can be
obtained analogously. The studied arbitrage is a stochastic
one due to the uncertainty of market supply/demand
and users’ responses. The probability distribution of the arbitrage
net profit from an ad display opportunity is estimated
by user response predictors [21] and the bid landscape
forecasting models [8], trained on historic large-scale data.
Essentially, the proposed Statistical Arbitrage Miner is a
campaign-independent RTB bidder, which assesses the arbitrage
opportunity for an incoming CPM bid request against
a portfolio of CPA campaigns, then selects a campaign and
provides a bid accordingly. Different from previous work on
per-campaign RTB bidding strategies [28, 34], we introduce
the concept of a meta-bidder, which performs the bidding for
a portfolio of ad campaigns, similar to a hedge fund holding
a set of valid assets in financial markets. In our SAM framework,
(i) functional optimisation is utilised to seek an
optimal bidding function to maximise the expected arbitrage
net profit, and (ii) a portfolio-based risk management solution
is leveraged to reallocate the bidding volume and budget
across multiple campaigns to make a trade-off between arbitrage
risk and return. We propose to jointly optimise those
two components in an EM fashion with high efficiency to
make the meta-bidder successfully catch the transient statistical
arbitrage opportunities in RTB. Experiments on both large-scale
datasets and online A/B tests demonstrate the large
improvement of our proposed SAM solutions over the
state-of-the-art baselines.


2. RELATED WORK 


Display Advertising Optimisation. Before the emergence
of the auction-based RTB market, most research work
on display advertising optimisation was about ad inventory
allocation on behalf of publishers in order to maximise the
revenue with the guaranteed delivery constraints [2, 14]. The
authors in [3] further propose an automatic model for pricing
the guaranteed contracts based on the prices of the targeted
individual user visits in a spot market. With the arrival of ad
exchanges and RTB, a lot of work has emerged on auction-based
optimisation for display advertising. On the publisher side,
the placement-level reserve price optimisation is studied in
[31]. The authors in [16] suggest that the publisher could
act as a bidder on behalf of its guaranteed contracts so as
to make smart inventory allocations among the guaranteed
and non-guaranteed contracts. One step further, the pricing
model of guaranteed contracts with the alternatives of the RTB
spot market is proposed in [7]. On the advertiser side, the
bid optimisation for campaign performance improvement is
studied. The authors in [20] propose a budget pacing scheme
embedded in a campaign conversion revenue optimisation
framework to maximise the campaign revenue. The authors
in [28, 34] focus on a bidding function formulation to maximise
the campaign clicks. Bid landscape forecasting models
[8] are studied to estimate the campaign’s impression volume
and cost given a bidding function.

The authors in [5] study auction mechanisms considering
arbitrage between CPC and CPM pricing schemes. The
study aims at designing an auction mechanism on behalf
of the ad exchange that yields truthful bidding from advertisers
and truthful CTR reporting from arbitrageurs. By
contrast, our work focuses on developing a statistical method
for mining and exploiting arbitrage opportunities between
CPA and CPM.


Statistical Arbitrage in Finance. In financial markets,
as a trading strategy, statistical arbitrage is a quantitative
approach to security trading. It utilises statistical
methods with high-frequency trading systems to detect statistical
mispricing of securities caused by market inefficiency
to make profit with a large number of transactions [18].


Drawing an analogy with the statistical arbitrage of security
pairs trading [11] in finance, in our paper, the campaign’s
CPA contract and its performance in RTB spot markets
can be regarded as a pair of correlated securities. Statistically
speaking, if the campaign’s performance in an RTB
market ensures that the average cost to acquire a conversion
(i.e., eCPA) is lower than the payoff from the CPA contract,
then a statistical arbitrage opportunity exists. Such an
opportunity could also be considered to be caused by the
informational inefficiency of the advertising market, where the
advertisers fail to lower their CPA payoff when their campaigns
in the RTB spot market perform well.

Modern Portfolio Theory in Finance. As Nobel Prize
work [24], modern portfolio theory (MPT) originates from
modelling the uncertainty of the return of financial assets. MPT
utilises the mean-variance analysis to make an investment
solution for any tradeoff between the expected return and
the risk, or w.r.t. a reference investment [29].


Recently, MPT has been introduced into information retrieval
(IR) fields to model the expectation and uncertainty
of users’ preference on the retrieved documents for search engines
[30] or from recommender systems [33]. To our knowledge,
there is no work adopting MPT for revenue optimisation
in online advertising. In our paper, we present a
novel way of using MPT, and it is naturally integrated into
our bid optimisation framework.

Table 1: Notations and descriptions.

Notation             Description
$x$                  The bid request represented by its features.
$p_x(x)$             The probability density function of $x$.
$i$                  The $i$th campaign in the meta-bidder portfolio.
$M$                  Number of campaigns in the meta-bidder portfolio.
$r_i$                The payoff of campaign $i$ for each conversion.
$R$                  The variable of meta-bidder arbitrage net profit.
$C$                  The variable of meta-bidder arbitrage cost.
$\theta(x, i)$       The predicted CVR if $i$ wins the auction of $x$.
                     We occasionally use $\theta$ to refer to a specific CVR.
$p_\theta^i(\theta)$ The probability density function of CVR $\theta$ for
                     campaign $i$.
$B$                  The meta-bidder total budget.
$T$                  The estimated number of bid requests during
                     the arbitrage period.
$b(\theta, r)$       The bidding function which returns the bid.
                     $b$ is also used to refer to a specific bid value.
$w(b)$               The probability of winning a bid request with
                     bid price $b$.
$v_i$                The probability of selecting campaign $i$.
                     For multiple campaigns, the campaign selection
                     probability vector is $v = (v_1, v_2, \ldots, v_M)^T$.


3. STATISTICAL ARBITRAGE MINING 

In this section we formulate and solve the SAM problem
in the context of RTB display advertising. Our paper is
intended to be self-contained, but for a detailed introduction
to RTB and its ecosystem, we refer to [32, 35].


3.1 Problem Definition 

Let us suppose there is an ad agent acting on behalf of
advertisers to run their ad campaigns. To hedge advertisers’
risk, quite often an ad agent gets paid on the basis of
performance: it receives a payoff each time a placed ad eventually
leads to a product purchase (cost-per-action, CPA).³ Note
that it remains an active research question whether and
how much a purchase action should be attributed to the ads
previously shown to the user. In this paper, we adopt the last-touch
attribution model commonly used in the industry - the last
ad impression before the user’s conversion event is assigned
the full attribution credit [9]. To run the campaigns
and place the ads, the agent then goes to the RTB market
to purchase ad impressions. In RTB, the ad agent pays the
cost for each ad impression displayed (cost-per-mille, CPM)
on the basis of a second-price auction. In essence, the ad agent
is an arbitrageur, making a profit so long as the payoff from
conversions (CPA) is higher than the cost (CPM) of acquiring
the users who make the purchase. Potentially, the
agent could run a large number of campaigns from various
advertisers in parallel to scale up its profit. Note that
the ad agent builds its business by taking on the risk from
the uncertainty of market competition and user behaviours.
For the entire ad ecosystem, this is healthy as it protects both
advertisers and publishers by introducing an intermediary
layer that exploits (and ultimately removes) the discrepancies
between market segments (in this case, the two pricing
schemes, CPA and CPM).

3 A notable example is mobpartner.com, which explicitly offers
payoffs (CPA deals) to anyone who can acquire the needed
customers programmatically.

Figure 1: An ad agent running a meta-bidder (arbitrageur)
for statistical arbitrage mining.

Traditionally, these arbitrages are accomplished manually.
With statistical approaches, it is possible that the above
operations can be automatically done by an intelligent meta-bidder
across campaigns, where for a certain CPA campaign,
the meta-bidder seeks cost-effective ad impressions with high
conversion possibility and low market competition.

Mathematically, we formulate the problem below: Suppose
there exist $M$ CPA-based campaigns. Each campaign
$i$ has set its payoff for a conversion as $r_i$. Over period $T$,
the meta-bidder keeps receiving bid requests at time
$t \in \{1, \ldots, T\}$, where each bid request is represented with a
high dimensional feature vector $x_t$ and, if won, is charged based
on CPM. For each of the incoming bid requests, the Statistical
Arbitrage Mining (SAM) problem is to select a campaign
and specify its bid such that over the period $T$ the expected
total arbitrage net profit (accumulated payoff minus cost) is
maximised.

We consider the following process. When a bid request
comes, the meta-bidder samples campaign $i$ with probability
$v_i$ to participate in the RTB auction, where $\sum_{i=1}^{M} v_i = 1$. Once
campaign $i$ is selected, the meta-bidder then estimates its
conversion rate (CVR), denoted as $\theta(x_t, i)$, i.e., if the ad
is placed in this impression, how likely the underlying user
will see the ad and eventually convert (purchase) [21]. After
that, the meta-bidder generates the bid price via a bidding
function $b(\theta, r_i)$ depending on the CVR $\theta(x_t, i)$ and the
conversion payoff $r_i$ [34]. The notations are summarised in Table 1; an
illustration of how the SAMer works is given in Figure 1.

Given campaign selection probability $v$ and bidding function
$b(\theta, r)$, the meta-bidder’s total arbitrage net profit is
given by summation over bid requests and campaigns:

$$R(v, b(\theta, r)) = \sum_{t=1}^{T} \sum_{i=1}^{M} \big(\theta(x_t, i) r_i - b(\theta(x_t, i), r_i)\big) \cdot w(b(\theta(x_t, i), r_i))\, v_i, \tag{1}$$

where $w(b)$ is the probability of winning an RTB auction
given bid $b$. The product $w(b) v_i$ specifies the probability that a
campaign is selected and wins the auction; $(\theta r_i - b)$ is the net profit
for the winning campaign. The total cost upper bound is

$$C(v, b(\theta, r)) = \sum_{t=1}^{T} \sum_{i=1}^{M} b(\theta(x_t, i), r_i)\, w(b(\theta(x_t, i), r_i))\, v_i, \tag{2}$$

where the bid price $b$ is the maximal possible cost for a campaign
to be placed due to the second price auction [31].
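Eqs. (1) and (2) can be evaluated directly once CVR predictions are available. The following is a minimal sketch, not the authors' implementation: the function name, the nested-list layout `cvr[t][i]` for $\theta(x_t, i)$, and the example bidding and winning functions are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple


def net_profit_and_cost(
    cvr: Sequence[Sequence[float]],        # cvr[t][i] plays the role of theta(x_t, i)
    payoffs: Sequence[float],              # r_i for each campaign
    v: Sequence[float],                    # campaign selection probabilities
    bid_fn: Callable[[float, float], float],
    win_fn: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (R, C): the sums of Eq. (1) and Eq. (2) over requests and campaigns."""
    net_profit = 0.0
    cost_bound = 0.0
    for row in cvr:
        for theta, r_i, v_i in zip(row, payoffs, v):
            b = bid_fn(theta, r_i)
            p = win_fn(b) * v_i            # campaign i is selected and wins
            net_profit += (theta * r_i - b) * p
            cost_bound += b * p            # bid is an upper bound on the price paid
    return net_profit, cost_bound


if __name__ == "__main__":
    # Hypothetical inputs: 2 bid requests, 2 campaigns.
    cvr = [[0.01, 0.02], [0.03, 0.01]]
    R, C = net_profit_and_cost(
        cvr, payoffs=[100.0, 50.0], v=[0.5, 0.5],
        bid_fn=lambda theta, r: 0.5 * theta * r,   # bid half of the expected payoff
        win_fn=lambda b: min(b / 2.0, 1.0),        # linear winning function, cap at 1
    )
    print(R, C)
```

Because the second-price auction charges the market price rather than the bid, the second return value is an upper bound on the real spend, as noted for Eq. (2).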

Next, we need to model how likely we will see an ad impression
with feature $x_t$ in the future. We assume $x_t \sim p_x(x_t)$;
that is, for a relatively short period, the bid request
features are drawn i.i.d. from a distribution built from historic data. The
whole model needs to be re-trained periodically with the
latest data. A detailed empirical study on the re-training frequency
for dynamic arbitrage will be given in Section 4.4.
Taking the integration over $x$ gives the expected net profit:

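$$\mathbb{E}[R(v, b(\theta, r))] = T \sum_{i=1}^{M} \int_x \big(\theta(x, i) r_i - b(\theta(x, i), r_i)\big)\, w(b(\theta(x, i), r_i))\, v_i\, p_x(x)\, dx$$
$$= T \sum_{i=1}^{M} v_i \int_\theta \big(\theta r_i - b(\theta, r_i)\big)\, w(b(\theta, r_i))\, p_\theta^i(\theta)\, d\theta, \tag{3}$$

where $p_\theta^i(\theta(x, i)) = p_x(x) / \|\nabla \theta(x, i)\|$ as there is a deterministic
relationship between $x$ and its estimated CVR $\theta(x, i)$,
also given in [34]. Similarly the total cost is rewritten as

$$\mathbb{E}[C(v, b(\theta, r))] = T \sum_{i=1}^{M} v_i \int_\theta b(\theta, r_i)\, w(b(\theta, r_i))\, p_\theta^i(\theta)\, d\theta. \tag{4}$$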


Finally, the SAM is cast as a constrained optimisation
problem: to find the campaign selection probability $v$ and bidding
function $b(\theta, r)$ to maximise the expected arbitrage net
profit with budget and risk constraints:

$$\operatorname*{argmax}_{b(),\, v} \ \mathbb{E}[R] \tag{5}$$
$$\text{s.t.} \quad \mathbb{E}[C] \le B \tag{6}$$
$$\mathrm{Var}[R] \le h \tag{7}$$
$$0 \le v \le 1 \tag{8}$$
$$v^T 1 = 1, \tag{9}$$

where we use the variance $\mathrm{Var}[R]$ to measure the risk of the net
profit and $h$ is a parameter for the upper tolerable risk.

We propose to solve the problem (Eq. (5)) in an EM fashion.
In particular, the campaign selection probability $v$ is regarded
as the latent factors to infer and the bidding function
$b(\theta, r)$ is regarded as the parameter used to maximise the
optimisation target. In the E-step, we fix the current estimated
bidding function $b(\theta, r)$ and solve the optimal campaign selection
probability $v$ with the constraints Eqs. (7), (8), & (9).
In the M-step, we fix the campaign selection probability $v$
and seek the optimal bidding function $b(\theta, r)$ to maximise
the target under the budget constraint Eq. (6). When the
EM iterations converge, all the constraints are satisfied
and the target is maximised. The following Section 3.2 will
describe the detailed solution of the optimal bidding function
(M-step), and Section 3.3 will discuss the solution of the campaign
selection probability $v$ (E-step).

3.2 Optimal Arbitrage Bidding Function 

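With the fixed $v$ and the budget constraint at Eq. (6), we
have a functional optimisation problem in the M-step:

$$\max_{b()} \ T \sum_{i=1}^{M} v_i \int_\theta \big(\theta r_i - b(\theta, r_i)\big)\, w(b(\theta, r_i))\, p_\theta^i(\theta)\, d\theta \tag{10}$$
$$\text{s.t.} \quad T \sum_{i=1}^{M} v_i \int_\theta b(\theta, r_i)\, w(b(\theta, r_i))\, p_\theta^i(\theta)\, d\theta \le B. \tag{11}$$

The Lagrangian is

$$\mathcal{L}(b(\theta, r), \lambda) = T \sum_{i=1}^{M} v_i \int_\theta \big(\theta r_i - (\lambda + 1) b(\theta, r_i)\big)\, w(b(\theta, r_i))\, p_\theta^i(\theta)\, d\theta + \lambda B. \tag{12}$$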




Figure 2: Linear winning function $w(b(\theta))$ and beta
CVR pdf $p_\theta(\theta)$. Panels: (a) Winning function; (b) CVR pdf.

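Taking its functional derivative w.r.t. $b(\theta, r)$, we have

$$\frac{\partial \mathcal{L}(b(\theta, r), \lambda)}{\partial b(\theta, r)} = T \sum_{i=1}^{M} \Big[ \big(\theta r_i - (1 + \lambda) b(\theta, r_i)\big) \frac{\partial w(b(\theta, r_i))}{\partial b(\theta, r_i)} - (1 + \lambda)\, w(b(\theta, r_i)) \Big] v_i\, p_\theta^i(\theta). \tag{13}$$

A sufficient condition for making this derivative zero is

$$\big(\theta r_i - (1 + \lambda) b(\theta, r_i)\big) \frac{\partial w(b(\theta, r_i))}{\partial b(\theta, r_i)} = (1 + \lambda)\, w(b(\theta, r_i)) \tag{14}$$

for all campaigns $i$. With the specific functional form of the winning
function $w(b)$ we can derive the optimal SAM bidding
function. Below we show solutions in two special cases.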

3.2.1 Uniform Market Price Solution 
Here we take a simple example of a linear winning function
form (see Figure 2(a)) based on the assumption of a
uniform market price⁴ distribution in $[0, l]$:

$$w(b(\theta, r)) = \frac{b(\theta, r)}{l}, \tag{15}$$

where the function domain is also $[0, l]$; $l$ is the upper bound
of the bid price and there is no need to bid higher than $l$.

Replacing Eq. (15) into Eq. (14) gives the optimal arbitrage
bidding function $b_{\mathrm{SAM1}}$ as

$$b_{\mathrm{SAM1}}(\theta, r) = \frac{r\theta}{2(1 + \lambda)}. \tag{16}$$

To calculate the optimal $\lambda$, a sufficient condition of the partial
derivative $\partial \mathcal{L}(b(\theta, r), \lambda)/\partial \lambda = 0$ in Eq. (12) is

$$\int_\theta b(\theta, r)\, w(b(\theta, r))\, p_\theta(\theta)\, d\theta = \frac{B}{T}. \tag{17}$$

Taking Eqs. (15) and (16) into Eq. (17) gives

$$\frac{r^2}{4(1 + \lambda)^2 l} \int_\theta \theta^2 p_\theta(\theta)\, d\theta = \frac{B}{T}, \tag{18}$$

where if we denote $\phi = \int_\theta \theta^2 p_\theta(\theta)\, d\theta$, we have

$$\lambda = \frac{r}{2} \sqrt{\frac{T\phi}{Bl}} - 1. \tag{19}$$

Replacing Eq. (19) into Eq. (16) gives the final solution of the
bidding function

$$b_{\mathrm{SAM1}}(\theta, r) = \theta \sqrt{\frac{Bl}{T\phi}}, \tag{20}$$

where surprisingly the bidding function does not depend on
$r$. This is because the linear forms of $w(b)$ in Eq. (15) and
$b_{\mathrm{SAM1}}(\theta, r)$ in Eq. (16) make $\theta$ factorised out from $r/(1 + \lambda)$
in Eq. (18), which in turn removes the factor of $r/(1 + \lambda)$.
$\phi$ depends on the probabilistic distribution $p_\theta(\theta)$, e.g., the
beta distribution Beta(2, 8) as shown in Figure 2(b), and
can be calculated with empirical data.

4 Market price refers to the highest bid price amongst the
competitors for each auction [1]. From a bidder’s perspective,
it can win an auction if its bid price is higher than
the market price on this auction.
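As a quick numerical check of Eq. (20), the sketch below estimates $\phi$ from sampled CVRs and verifies that the resulting bids spend the budget in expectation under the linear winning function of Eq. (15). It is an illustration under stated assumptions: the function names are hypothetical, the CVRs are drawn from Beta(2, 8) purely as example data, and the budget, $l$ and $T$ values are made up.

```python
import math
import random


def second_moment(cvrs):
    """Empirical estimate of phi, the second moment of the CVR distribution."""
    return sum(t * t for t in cvrs) / len(cvrs)


def sam1_bid(theta, budget, l, T, phi):
    """Eq. (20): bid proportional to the CVR; never bid above the upper bound l."""
    return min(theta * math.sqrt(budget * l / (T * phi)), l)


def expected_spend(cvrs, budget, l, T):
    """T times the empirical mean of b * w(b), with w(b) = b / l as in Eq. (15)."""
    phi = second_moment(cvrs)
    per_request = sum(sam1_bid(t, budget, l, T, phi) ** 2 / l for t in cvrs) / len(cvrs)
    return T * per_request


if __name__ == "__main__":
    rng = random.Random(7)
    cvrs = [rng.betavariate(2, 8) for _ in range(10000)]  # example data only
    # Hypothetical budget, price bound and request volume.
    print(expected_spend(cvrs, budget=50.0, l=300.0, T=100000))
```

When no bid is clipped at $l$, the expected spend equals the budget up to floating-point error, which is exactly the equality in Eq. (17); note that the payoff $r$ does not appear anywhere in the bid.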


3.2.2 Long Tail Market Price Solution
We now consider a more practical winning function used
in [34], which is based on a long tail market price distribution
$p_z(z) = l/(z + l)^2$ with parameter $l$. As such, the winning
function is

$$w(b(\theta, r)) = \int_0^{b(\theta, r)} p_z(z)\, dz = \frac{b(\theta, r)}{b(\theta, r) + l}. \tag{21}$$

The real-world data analysis on winning bids in [34] demonstrates
the feasibility of adopting the winning function in
Eq. (21) in practice. Taking Eq. (21) into Eq. (14) gives the
optimal arbitrage bidding function $b_{\mathrm{SAM2}}$ as

$$b_{\mathrm{SAM2}}(\theta, r) = \sqrt{\frac{r l \theta}{1 + \lambda} + l^2} - l, \tag{22}$$

which is in a concave form w.r.t. CVR $\theta$.

Solution of $\lambda$. It is possible that the optimal situation does
not exhaust the budget and we can leverage the training data
to tune the empirically best $\lambda$ as a parameter. However, if
we assume that the bid request volume $T$ is large enough
to exhaust the budget, then the optimal case is an equality
condition for Eq. (11). To calculate the optimal $\lambda$, the Euler-Lagrange
condition of $\lambda$ is Eq. (17). With Eq. (22), we
explicitly regard $\lambda$ as an input of the bidding function $b(\theta, r, \lambda)$
and rewrite $\partial \mathcal{L}(b(\theta, r), \lambda)/\partial \lambda = 0$ from Eq. (12) as

$$\sum_{i=1}^{M} v_i \int_\theta b(\theta, r_i, \lambda)\, w(b(\theta, r_i, \lambda))\, p_\theta^i(\theta)\, d\theta = \frac{B}{T}. \tag{23}$$

In most situations, except some special cases like Section 3.2.1,
$\lambda$ has no analytic solution. For a numeric solution, we can
rewrite Eq. (23) as

$$\sum_{i=1}^{M} v_i \int_\theta \Big( b(\theta, r_i, \lambda)\, w(b(\theta, r_i, \lambda)) - \frac{B}{T} \Big)\, p_\theta^i(\theta)\, d\theta = 0, \tag{24}$$

which has the same solution as the minimisation problem

$$\min_\lambda \sum_{i=1}^{M} v_i \int_\theta \frac{1}{2} \Big( b(\theta, r_i, \lambda)\, w(b(\theta, r_i, \lambda)) - \frac{B}{T} \Big)^2 p_\theta^i(\theta)\, d\theta.$$

If we have a very large number $N_i$ of observations of $\theta$'s
for each campaign $i$, we can write the above equation as

$$\min_\lambda \sum_{i=1}^{M} \sum_{k=1}^{N_i} \frac{v_i}{2} \Big( b(\theta_k^i, r_i, \lambda)\, w(b(\theta_k^i, r_i, \lambda)) - \frac{B}{T} \Big)^2, \tag{25}$$

where we can use (mini-)batch descent or stochastic gradient
descent to solve $\lambda$ by the following iteration with learning rate $\eta$:

$$\lambda \leftarrow \lambda - \eta \sum_{i=1}^{M} \sum_{k=1}^{N_i} v_i \Big( b(\theta_k^i, r_i, \lambda)\, w(b(\theta_k^i, r_i, \lambda)) - \frac{B}{T} \Big) \Big( \frac{\partial b(\theta_k^i, r_i, \lambda)}{\partial \lambda}\, w(b(\theta_k^i, r_i, \lambda)) + b(\theta_k^i, r_i, \lambda)\, \frac{\partial w(b(\theta_k^i, r_i, \lambda))}{\partial \lambda} \Big) \tag{26}$$

until convergence. Usually, as $b(\theta, r, \lambda)$ has a monotonic
relationship with $\lambda$ and $w(b(\theta, r, \lambda))$ monotonically increases
against $b(\theta, r, \lambda)$, $b(\theta_k^i, r, \lambda)\, w(b(\theta_k^i, r, \lambda))$ has a monotonic
relationship with $\lambda$. For example, with the bidding function
as Eq. (22) and the winning function as Eq. (21), the factor
$b(\theta_k^i, r, \lambda)\, w(b(\theta_k^i, r, \lambda))$ decreases monotonically against
$\lambda$, which makes the optimal solution quite easy to find.


3.3 Optimal Campaign Selection

Fixing the resolved optimal arbitrage bidding function $b(\theta, r)$
from the previous M-step, we can optimise the campaign selection
probability $v$ and check whether it is better to reallocate
the volume for each campaign.

We here introduce the concept of the SAM net profit margin
$\gamma$ in RTB display advertising. The net profit margin is the
ratio of the net profit of the advertising, either from one
campaign or a set of them (meta-bidder), divided by the
advertising cost during the corresponding period. In fact,
$\gamma = R/C = \mathrm{ROI} - 1$. $\gamma$ is a random variable with expectation
and variance. By modelling $\gamma_i$ for each campaign $i$, the
optimal campaign selection can be solved by portfolio-based
risk management methods.

3.3.1 Single Campaign

With the optimal arbitrage bidding function $b(\theta, r)$ by Eq. (14),
we calculate the expectation and variance of the net profit
margin $\gamma_i$ for each campaign $i$ by

$$\mu_i(b) = \mathbb{E}[\gamma_i] = \mathbb{E}\Big[ \frac{R_i(v_i = 1, b)}{C_i(v_i = 1, b)} \Big], \tag{27}$$
$$\sigma_i^2(b) = \mathbb{E}\Big[ \frac{R_i(v_i = 1, b)^2}{C_i(v_i = 1, b)^2} \Big] - \mathbb{E}\Big[ \frac{R_i(v_i = 1, b)}{C_i(v_i = 1, b)} \Big]^2, \tag{28}$$

where $R_i(v_i = 1, b)$ and $C_i(v_i = 1, b)$ are as in Eqs. (1) and (2)
with $v_i = 1$ and $v_j = 0$ for all other campaigns $j$. Both $\mu_i(b)$
and $\sigma_i^2(b)$ can be estimated from MCMC methods: (i) repeat
$N$ times on sampling $T$ bid requests from the training data
and calculate $R_i(v_i = 1, b)$ and $C_i(v_i = 1, b)$, then (ii) calculate
the expectation and variance using these $N$ observations of
$R_i(v_i = 1, b)$ and $C_i(v_i = 1, b)$.

3.3.2 Campaign Portfolio
Suppose there are $M$ campaigns in the meta-bidder with
CPA contracts. For each campaign $i$, as discussed in Section 3.3.1,
there is a variable of achieved net profit margin $\gamma_i$
given the bidding function $b()$; its expectation is $\mu_i(b)$
and its standard deviation is $\sigma_i(b)$. As such, the vector of expected
net profit margins for these $M$ campaigns is

$$\mu(b) = (\mu_1(b), \mu_2(b), \ldots, \mu_M(b))^T \tag{29}$$

and the covariance matrix for the net profit margins of the
$M$ campaigns is $\Sigma(b) = \{\sigma_{i,j}(b)\}_{i=1 \ldots M,\, j=1 \ldots M}$, where each
element

$$\sigma_{i,j}(b) = \beta_{i,j}\, \sigma_i(b)\, \sigma_j(b), \tag{30}$$

where $\beta_{i,j} \in [-1, 1]$ is the net profit margin correlation factor
between campaigns $i$ and $j$, which can be calculated
routinely given the net profit margin time series of the two
campaigns $i$ and $j$ [24].

We call such a probabilistic campaign combination a campaign
portfolio. With the campaign selection probability $v$,
the campaign portfolio expected net profit margin and its
variance are

$$\mu_p(v, b) = v^T \mu(b), \qquad \sigma_p^2(v, b) = v^T \Sigma(b)\, v. \tag{31}$$

Generally, the arbitrage net profit margin may change w.r.t.
the allocated volume: the more bid request volume, the more
statistical arbitrage opportunities, and the higher the margin.
For simplicity, we assume that the net profit margin distribution
does not change much w.r.t. the auction volume
allocated to the campaign during a short period. The empirical
results in Section 4.3 will demonstrate the validity
of the assumption.

Algorithm 1 Statistical Arbitrage Mining for Display Ads
Require: Meta-bidder winning function $w(b)$
Require: CVR distribution $p_\theta^i(\theta)$ for each campaign $i$
  Initialise $b(\theta, r) = r\theta$ and $v = \frac{1}{M} 1$.
  while not converged do
    E-step:
      Get $\mu(b)$ and $\Sigma(b)$ by Eq. (29) and Eq. (30)
      Solve optimal $v$ by Eq. (32)
    M-step:
      Get the bidding function form by $w(b)$ and Eq. (14)
      Solve $\lambda$ by Eq. (26)
      Update the SAM bidding function $b(\theta, r)$ by Eq. (22)
  end while
  return $v$ and $b(\theta, r)$
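The margin statistics of Eqs. (27)-(28) and the portfolio moments of Eq. (31) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and plain resampling with replacement from a list of training CVRs is assumed as the sampling step (the text refers to it as an MCMC method).

```python
import random
from statistics import mean, pvariance


def margin_samples(thetas, payoff, bid_fn, win_fn, T, N, seed=0):
    """Resample T bid requests N times; return the N observed margins R_i / C_i."""
    rng = random.Random(seed)
    samples = []
    for _ in range(N):
        net_profit = 0.0
        cost = 0.0
        for _ in range(T):
            theta = rng.choice(thetas)
            b = bid_fn(theta, payoff)
            w = win_fn(b)
            net_profit += (theta * payoff - b) * w  # term of Eq. (1) with v_i = 1
            cost += b * w                           # term of Eq. (2) with v_i = 1
        samples.append(net_profit / cost)
    return samples


def margin_moments(samples):
    """Empirical mu_i and sigma_i^2 of Eqs. (27) and (28)."""
    return mean(samples), pvariance(samples)


def portfolio_moments(v, mu, sigma, beta):
    """Eq. (31), with the covariance built from Eq. (30)."""
    m = len(v)
    mu_p = sum(v[i] * mu[i] for i in range(m))
    var_p = sum(
        v[i] * v[j] * beta[i][j] * sigma[i] * sigma[j]
        for i in range(m) for j in range(m)
    )
    return mu_p, var_p


if __name__ == "__main__":
    # Hypothetical two-campaign portfolio with negatively correlated margins.
    print(portfolio_moments([0.5, 0.5], [0.2, 0.4], [0.1, 0.2], [[1, -1], [-1, 1]]))
```

The usage line shows why diversification matters here: with perfectly negatively correlated margins, an even split has a much lower variance than either campaign alone.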


3.3.3 Campaign Portfolio Optimisation 
The E-step of the original optimisation problem Eq. (5),
with the fixed bidding function and constraint Eqs. (7), (8)
& (9), can be rewritten by taking the Lagrangian as

$$\max_v \ v^T \mu(b) - \alpha\, v^T \Sigma(b)\, v, \quad \text{s.t.} \ v^T 1 = 1, \ 0 \le v \le 1, \tag{32}$$

where the Lagrangian multiplier $\alpha$ acts as a risk-averse parameter
to balance the expected net profit margin and
its variance. This optimisation framework is widely used in
portfolio optimisation [30, 33].

When the risk, i.e., the variance of the net profit margin,
is not considered, $\alpha$ is set as 0. Then the campaign $i$ with
the highest $\mu_i(b)$ will always be selected, i.e., $v_i = 1$, while
$v_j = 0$ for all other campaigns $j$.
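One simple way to solve Eq. (32) numerically is projected gradient ascent on the probability simplex. The paper does not state which solver it uses, so the sketch below is an assumption, with hypothetical function names, step size and iteration count.

```python
def project_to_simplex(x):
    """Euclidean projection of x onto {v : v >= 0, sum(v) = 1} (sort-based method)."""
    u = sorted(x, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            tau = t  # the last k that passes the test gives the threshold
    return [max(x_i - tau, 0.0) for x_i in x]


def solve_portfolio(mu, cov, alpha, step=0.1, iters=2000):
    """Maximise v^T mu - alpha * v^T cov v over the simplex, as in Eq. (32)."""
    m = len(mu)
    v = [1.0 / m] * m  # uniform start, as in Algorithm 1
    for _ in range(iters):
        grad = [
            mu[i] - 2.0 * alpha * sum(cov[i][j] * v[j] for j in range(m))
            for i in range(m)
        ]
        v = project_to_simplex([v[i] + step * grad[i] for i in range(m)])
    return v


if __name__ == "__main__":
    # Hypothetical margins: with alpha = 0 all volume goes to the best campaign.
    print(solve_portfolio([0.2, 0.5, 0.3], [[0.0] * 3 for _ in range(3)], alpha=0.0))
    # Equal margins, different variances: the less risky campaign gets more volume.
    print(solve_portfolio([0.3, 0.3], [[0.04, 0.0], [0.0, 0.01]], alpha=1.0, step=1.0))
```

With $\alpha = 0$ the solver reproduces the winner-takes-all behaviour described above; with $\alpha > 0$ it spreads volume towards lower-variance campaigns.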

Finally, the overall operations to get the optimal campaign
selection probability $v$ and the arbitrage bidding function
$b(\theta, r)$ are summarised in Algorithm 1. Theoretically, just
like the EM algorithms for likelihood maximisation, every
EM iteration in our case will at least not decrease the expected
net profit in Eq. (5). In practice, $v$ and $b(\theta, r)$ converge
within 5 EM iterations. For the E-step, the computationally
costly parts are the MCMC methods for evaluating
the margin of the $M$ individual campaigns (Eqs. (27) and (28)),
where the time complexity is $O(MNT)$, and the campaign
correlation calculation ($\beta_{i,j}$ in Eq. (30)), which is $O(M^2NT)$.
For the M-step, the bidding function is derived in closed form;
the calculation of $\lambda$ by numeric descent methods (Eq. (26))
depends on the data values but is normally very efficient.
The performance in Section 4.4 will demonstrate
the capability of our proposed solution for highly efficient
re-training in dynamic arbitrage tasks.
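For the long-tail case, the budget equation of Eq. (23) can also be solved without gradients, because the expected spend decreases monotonically in $\lambda$. The sketch below swaps the descent iteration of Eq. (26) for plain bisection; this substitution, the function names, the $1/N_i$ empirical averaging and the search bounds are assumptions made for illustration.

```python
import math


def sam2_bid(theta, r, lam, l):
    """Eq. (22): long-tail bidding function."""
    return math.sqrt(r * l * theta / (1.0 + lam) + l * l) - l


def win_prob(b, l):
    """Eq. (21): long-tail winning function."""
    return b / (b + l)


def spend_per_request(lam, thetas_by_campaign, payoffs, v, l):
    """Empirical left-hand side of Eq. (23), averaging over each campaign's CVRs."""
    total = 0.0
    for thetas, r, v_i in zip(thetas_by_campaign, payoffs, v):
        bids = [sam2_bid(t, r, lam, l) for t in thetas]
        total += v_i * sum(b * win_prob(b, l) for b in bids) / len(thetas)
    return total


def solve_lambda(thetas_by_campaign, payoffs, v, l, budget, T, hi=1e6):
    """Bisection on lambda >= 0 so that the expected spend per request is B / T."""
    target = budget / T
    lo = 0.0
    if spend_per_request(lo, thetas_by_campaign, payoffs, v, l) <= target:
        return 0.0  # budget is not binding, so no penalty on cost is needed
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if spend_per_request(mid, thetas_by_campaign, payoffs, v, l) > target:
            lo = mid   # still overspending: increase lambda
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical CVR observations for two campaigns.
    thetas = [[0.001, 0.002, 0.005], [0.01, 0.02]]
    lam = solve_lambda(thetas, [100.0, 20.0], [0.5, 0.5], l=1.0, budget=5.0, T=1000)
    print(lam, spend_per_request(lam, thetas, [100.0, 20.0], [0.5, 0.5], 1.0))
```

Bisection needs no learning rate, at the price of assuming the monotonicity discussed for Eqs. (21) and (22); for other winning functions the descent iteration may be the safer choice.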


4. EXPERIMENTS 

4.1 Experiment Setting 

4.1.1 Datasets 

We conduct our experiments⁵ based on two real-world large-scale
bidding logs collected from two DSP companies.

5 The experiment code has been published at
https://github.com/wnzhang/rtbarbitrage


iPinYou RTB dataset was published after iPinYou’s global
RTB algorithm competition in 2014. This dataset contains
the bidding and user feedback logs from 9 campaigns during
10 days in 2013, and consists of 64.75M bid records,
19.50M impressions, 14.79K clicks and 16K CNY expense.
The dataset disk size is 35GB. According to the data
publisher [23], the last three days of data of each campaign
are split off as the test data and the rest is used as the
training data. More statistics and analysis of the dataset
are available in [35].

BigTree RTB dataset is a proprietary dataset from our
partner DSP company BigTree Times Co. This dataset was
collected from Nov. 2014 to Feb. 2015 for 3 iOS mobile game
campaigns. It consists of 10.85M impressions and 46.38K
actions⁶ with $0.083 CPA. We use this dataset to train the
model and conduct an online A/B test on BigTree DSP during
Feb. 2015.

Both datasets are in a record-per-line format, where each 
line consists of three parts: (i) the features for this auction, 
e.g., the time, location, IP address, the URL/domain of the 
publisher, ad slot size, user interest segments etc.; (ii) the 
auction winning price, which is the threshold of the bid to 
win this auction; (iii) the user feedback on the ad impression, 
i.e., click, conversion or not. 


4.1.2 Evaluation Protocol 

Evaluation procedure. We adopt an evaluation procedure
similar to the previous work on bid optimisation [34, 35].
In addition, for the evaluation related to the campaign
sampling process (via v), we follow an offline evaluation scheme
similar to previous work on evaluating interactive systems
[22]. In the historic data, the user’s feedback is only
associated with the winning campaign of the auction, so there
is no corresponding user feedback if a different campaign is
sampled. As such, based on the bid request i.i.d. assumption
made before, for each round, we first sample a campaign i,
then pass the next test data record of this campaign to the
bid agent for bidding. If there is no more test data left for
this campaign, i.e., its bid requests have run out, the test
ends.
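
The sampling-based replay described above can be sketched as follows. This is a minimal illustration under our own assumptions: `test_logs` maps each campaign to its ordered test records, `v` maps each campaign to its selection probability, and `bid_agent` is a hypothetical callable that returns a bid for a (campaign, record) pair.

```python
import random


def replay(test_logs, v, bid_agent, seed=0):
    """Sample a campaign per round and feed it its next test record."""
    rng = random.Random(seed)
    campaigns = list(test_logs)
    weights = [v[c] for c in campaigns]
    cursor = {c: 0 for c in campaigns}  # next unread record per campaign
    served = []
    while True:
        c = rng.choices(campaigns, weights=weights, k=1)[0]
        if cursor[c] >= len(test_logs[c]):
            break  # the sampled campaign has run out of bid requests
        record = test_logs[c][cursor[c]]
        cursor[c] += 1
        served.append((c, bid_agent(c, record)))
    return served
```

Each campaign's records are consumed strictly in log order, and the test stops the first time the sampled campaign has no record left, so campaigns with low selection probability may leave part of their test data unused.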

Budget constraints. It is easy to see that if we set the
budget the same as the original total cost in the test log,
then simply bidding as much as possible for each auction
will exactly exhaust the budget and get all the logged clicks.

In our work, to test the performance against various budget
constraints, for each campaign, we respectively run the
evaluation test using 1/2, 1/4, 1/8, ..., 1/256 of the original
total cost in the test log as the budget.
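
A small sketch of the budget grid and of a budget-limited replay is given below. It assumes, for illustration only, that each record is a `(winning_price, clicked)` pair, that the winner pays the logged winning price, and that an auction is skipped when paying for it would exceed the budget; the helper names are hypothetical.

```python
def budget_grid(total_cost):
    """Budgets equal to 1/2, 1/4, ..., 1/256 of the logged total cost."""
    return [total_cost / 2 ** k for k in range(1, 9)]


def replay_with_budget(records, bid, budget):
    """Replay logged auctions with a constant bid under a budget."""
    cost, clicks = 0, 0
    for price, clicked in records:
        # Win when the bid reaches the logged winning price and the
        # remaining budget covers the payment.
        if bid >= price and cost + price <= budget:
            cost += price
            clicks += clicked
    return cost, clicks


# Hypothetical log: four auctions, two of which led to a click.
records = [(30, 0), (80, 1), (55, 0), (120, 1)]
total_cost = sum(price for price, _ in records)
full = replay_with_budget(records, bid=10 ** 9, budget=total_cost)
half = replay_with_budget(records, bid=10 ** 9, budget=total_cost / 2)
```

With the full logged cost as the budget, bidding very high wins every auction and collects every logged click, which is why the tests use only fractions of that cost.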

Payoff setting. To set up various difficulties in arbitrage,
for our offline experiments, we manually adjust the
CPA payoff for each iPinYou campaign. Specifically, for each
campaign i, we set a high and a low CPA payoff in order to
test the algorithms’ performance under an easy and a hard
arbitrage situation, denoted as $r_i^{\text{easy}}$ and
$r_i^{\text{hard}}$, respectively:

$$r_i^{\text{easy}} = \text{eCPA}_i \times 0.8 \quad \text{and} \quad r_i^{\text{hard}} = \text{eCPA}_i \times 0.2,$$

where $\text{eCPA}_i$ is the original average cost for acquiring each
conversion of campaign i in the training data without any
arbitrage strategy. In addition, the conversion data in iPinYou
is unavailable for 7 out of 9 campaigns. To run more tests,
we thus regard the user clicks as a proxy for the desired
actions (conversions) in our offline experiment.
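
As a worked instance of the payoff rule above, the sketch below derives the easy and hard payoffs from a campaign's training cost and conversion count. The function name and the example figures are hypothetical.

```python
def cpa_payoffs(train_cost, train_conversions):
    """Easy and hard CPA payoffs as 0.8 and 0.2 times the training eCPA."""
    ecpa = train_cost / train_conversions  # average cost per conversion
    return {"easy": 0.8 * ecpa, "hard": 0.2 * ecpa}


# Hypothetical campaign: 1000 CNY spent for 50 conversions, so eCPA = 20 CNY.
payoffs = cpa_payoffs(1000.0, 50)
```

For this made-up campaign the easy payoff is 16 CNY and the hard payoff is 4 CNY per conversion.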


6 According to the advertiser’s contract, here the action is 
defined by users’ landing on the game’s page on app store. 














To complement the offline tests, in our online experiments,
we directly adopt the CPA payoff specified by genuine
advertisers to test the real business case.


4.1.3 Compared Strategies 
We compare the following baseline and state-of-the-art
bidding strategies in our experiment. Their parameters are tuned
on the training data.

Constant bidding (const). A constant bid regardless of bid
requests and campaigns. Though trivial, it is a simple
solution widely used by many DSPs.

Random bidding (rand). Randomly choose a bid value in 
a given range. 

Truth-telling bidding (truth). If there is no budget
constraint, one should bid the true value for each ad
impression, which is CPA×CVR of the impression [21].

Linear bidding (lin). In [28], the bid value is linearly
proportional to the CVR, with the bid scale parameter
tuned to maximise the expected conversion number.

Optimal real-time bidding (ortb). This is an optimal
bidding strategy proposed in [34] to maximise clicks. Here
we compare against the ortb1 bid strategy in [34].

Statistical arbitrage mining (sam1, sam2). These are the
two bidding strategies proposed in this paper: sam1 is
from Eq. (16) and sam2 is from Eq. (22), collectively
denoted as samx.

SAM with competition modelling (sam1c, sam2c). In
a real online environment, the advertisers will tune their
bidding strategies according to their campaign performance.
If many bidders adopt our samx bidding strategies,
it is possible that this may change the market
prices. In our offline empirical study, we follow [36]
to adopt the opt bidding strategy [6] to simulate the
market price changes towards a locally envy-free
equilibrium⁷. Note that this is not for comparing bidding
strategies but for comparing auction environments, where
we would like to check whether our proposed samx
algorithms would still make arbitrage profit when the
market changes according to our actions. We only compare
the performance of samx algorithms with those in
the corresponding samxc settings.

For campaign selection strategies, we compare the uniform
campaign selection, i.e., $v_i = 1/M$, and the portfolio-based
campaign selection, where portfolio is denoted as greedy
when $\alpha$ in Eq. (32) is set to 0. The conventional campaign
selection scheme based on internal auctions [32] will be
compared in the online A/B test in Section 4.5.
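
The four simplest bidders in the list above can be sketched as below. The signatures, default values and the `scale` parameter are illustrative assumptions (in particular, units such as CPM versus per-impression prices are ignored); the sam1 and sam2 functions depend on Eqs. (16) and (22) and are not reproduced here.

```python
import random


def bid_const(cvr, cpa, value=50.0):
    """const: the same bid for every request and campaign."""
    return value


def bid_rand(cvr, cpa, low=0.0, high=100.0, rng=random):
    """rand: a bid drawn uniformly from a given range."""
    return rng.uniform(low, high)


def bid_truth(cvr, cpa):
    """truth: bid the true value of the impression, CPA x CVR."""
    return cpa * cvr


def bid_lin(cvr, cpa, scale=1000.0):
    """lin: bid proportional to CVR; scale would be tuned on training data."""
    return scale * cvr
```

All four share one signature so that a replay loop could swap strategies without other changes; only `bid_truth` actually uses the CPA payoff.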


4.1.4 Evaluation Measures 
We use the net profit as the primary evaluation measure,
which is calculated as #conversions * cpa_payoff - cost.
We also evaluate the net profit margin for each strategy,
which is calculated as the net profit divided by the cost. In
addition, we report the number of impressions and
conversions as well as the cost for each strategy.
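
The two measures translate directly into code. The function names are ours; the margin check uses the const row of Table 2 (profit 41.77 CNY, cost 194.44 CNY), while the net profit call uses made-up inputs.

```python
def net_profit(conversions, cpa_payoff, cost):
    """Net profit: #conversions * cpa_payoff - cost."""
    return conversions * cpa_payoff - cost


def net_profit_margin(profit, cost):
    """Net profit margin: net profit divided by cost."""
    return profit / cost


# Hypothetical campaign: 10 conversions paid at 5.0 each, 30.0 spent.
example_profit = net_profit(10, 5.0, 30.0)

# const row of Table 2 (easy payoff): rounds to the reported margin of 0.21.
const_margin = round(net_profit_margin(41.77, 194.44), 2)
```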


4.2 Single Campaign Arbitrage 

In Table 2 we report the overall performance on the tested
9 campaigns from the iPinYou dataset. We see that samx
bidding strategies outperform all others in terms of net

7 The work [6] is on sponsored search with generalised second
price auctions. By setting the slot number for each keyword 
auction as 1 and the CTR as 1.0, the opt bidding strategy 
can be used for our display advertising scenario. 


Table 2: Single-campaign overall performance.

Easy payoff, 1/16 budget setting

bid. algo.  profit (CNY)  margin  bids (M)  imps. (K)  cnvs.  cost (CNY)
const              41.77    0.21      2.68     761.91    297      194.44
rand               19.65    0.12      2.97     612.90    223      166.60
truth             749.75    3.60      1.89     420.19  1,137      208.33
lin               845.22    3.83      2.71     531.49  1,161      220.90
ortb              869.43    4.03      2.87     632.38  1,172      215.78
sam1            1,141.72    6.02      3.26     471.46  1,504      189.55
sam2            1,161.24    5.97      3.42     606.97  1,534      194.40
sam1c           1,118.61    6.10      3.24     389.09  1,473      183.34
sam2c           1,141.01    5.87      3.41     563.74  1,513      194.38

Hard payoff, 1/16 budget setting

bid. algo.  profit (CNY)  margin  bids (M)  imps. (K)  cnvs.  cost (CNY)
const              -1.40   -0.25      4.10      81.55     10        5.53
rand                1.08   10.47      4.10       8.36      4        0.10
truth             214.08    2.13      4.03     373.66  1,430      100.30
lin                45.63    0.21      2.71     531.49  1,161      220.90
ortb               55.52    0.26      2.87     632.38  1,172      215.78
sam1              207.34    2.29      3.89     319.77  1,328       90.59
sam2              227.76    3.77      4.10     301.99  1,326       60.47
sam1c             204.73    2.25      3.88     308.44  1,322       90.98
sam2c             225.95    3.70      4.10     298.51  1,322       61.13


Figure 3: Single campaign arbitrage performance. Panels: (a) net profit (easy); (b) net profit margin (easy).


profit. sam2 further outperforms sam1, particularly in the
hard payoff settings, due to its more practical winning
function. In addition, samxc strategies still make high arbitrage
profit with the market competition modelling, which
demonstrates the potential of samx strategies in a real market
competition environment.

Furthermore, we monitor the performance change on the
arbitrage net profit and margin of each algorithm w.r.t. the
budget setting in Figure 3. Due to the page limit, we only report
the results with the easy payoff setting, while the results on
the hard payoff setting are similar. The value on the x-axis
is the ratio of the original total cost in the test data to the
test budget. The higher the ratio, the smaller the budget.
From Figure 3 we have the following observations. (i) sam1
and sam2 outperform the rest in almost all the profit and
margin comparisons with different budget settings. (ii) Under
the higher budget settings, e.g., budget ratios of 2 or 4, truth
produces profit comparable to samx. This is because when the
budget is abundant, the budget constraint need not be tight
(i.e., hold with equality) in order to maximise the net profit.
In such a situation, the bidding problem reduces to
the classic second price auction problem, where the
truth-telling bidding strategy is optimal. (iii) Under the lower
budget settings, e.g., budget ratios of 64, 128 and 256, the
profit from truth drops significantly because the budget
constraint becomes binding and the optimal bidding
strategy is not truth-telling. By contrast, lin and ortb act
almost the same as samx. This is reasonable because under
the lower budget settings, the budget is always exhausted.
With the cost the same as the budget, the more conversions,
the more arbitrage profit.















Figure 4: Multiple campaign arbitrage performance comparison and campaign portfolio selection analysis. Panels: (a) net profit (easy); (b) net profit (hard); (c) competition profit drop (easy).


Table 3: Multi-campaign overall performance.

                          easy payoff           hard payoff
bid. algo.  cam. select.  profit (CNY)  margin  profit (CNY)  margin
lin         greedy              501.12    6.63         68.59    0.91
lin         portfolio           925.45   13.11        181.54    2.50
lin         uniform             747.00    9.53        127.14    1.62
ortb        greedy              517.02    6.65         70.96    0.91
ortb        portfolio           802.15   10.32        146.13    1.88
ortb        uniform             765.12    9.89        133.16    1.72
sam1        greedy              966.02   20.81        230.38   11.13
sam1        portfolio         1,037.98   15.84        240.63    7.96
sam1        uniform             768.38    9.78        172.43    7.57
sam2        greedy              961.68   28.73        235.31   24.00
sam2        portfolio           983.01   17.21        248.65   13.61
sam2        uniform             774.09   10.32        168.15    5.16
truth       greedy              787.10   14.69        227.86   29.05
truth       portfolio           787.10   14.69        242.07   18.34
truth       uniform             326.57    4.14        101.12    5.36


4.3 Multiple Campaign Arbitrage 

We test 6 campaign portfolios from the iPinYou dataset.
Each portfolio contains 4 campaigns with the data from the
same period. For each portfolio, after the convergence of EM
iterations, the empirically optimal v and bidding function
$b(\theta, r)$ are deployed in the campaign portfolio’s test stage,
where the auction volume and the budget are set the same
as in the training stage. Compared with the previous single
campaign part, this part of the experiment focuses more on the
campaign portfolio selection, where the uniform, greedy and
portfolio selection methods are compared.

The overall results with the 1/32 budget setting are reported
in Table 3. For the comparison among the bidding
strategies, samx overall outperforms others in both payoff settings.
Figure 4 provides more detailed analysis. The profit trend
against the budget setting, as shown in Figures 4(a) and
4(b), is consistent with the single campaign setting. The
competitor model setting does not significantly drop the
arbitrage net profit, as shown in Figure 4(c). Specifically, when
the budget gets lower, the profit drop percentage gets lower.
The reason is that fewer auctions are won with a lower budget,
so the market does not change much. To compare
campaign selection, Table 3 shows that portfolio selection
consistently outperforms uniform and greedy selection. Compared
with uniform, greedy allocates all the auction volume and
the budget onto the campaign evaluated as having the highest
arbitrage net profit margin, which theoretically maximises
the expected net profit. However, the result that portfolio
outperforms greedy indicates that there exists a return-risk
trade-off point which in practice generalises better than the
maximum expectation solution. Furthermore, Figure 4(d)
shows the change of total profit from the 6 tested campaign
portfolios based on sam2 against the portfolio risk-averse
parameter $\alpha$ in Eq. (32). Here setting $\alpha$ to a small enough
value is equivalent to the greedy campaign selection. As
we can see, as $\alpha$ increases from $10^{-3}$, the net profit first
rises to the peak value and then drops significantly. Across
the different budget settings, we can observe a trend from
Figure 4(d): the more budget, the higher the optimal
$\alpha$. For the 1/256 budget setting, the optimal $\alpha$ is 0.01, while
0.1 is optimal for the 1/4 budget setting. This may be due to the
fact that more budget brings more auction volume across a
longer period, introducing more risk, which needs to be
carefully hedged.

In addition, we present a case study on a campaign
portfolio (3358, 3386, 3427 and 3476 are the four campaign IDs).
Its return-risk analysis plot is shown in Figure 4(e) and the
corresponding campaign selection probability allocation is
shown in Figure 4(f). In Figure 4(e), the dark blue points
stand for the expected net profit margin and its standard
deviation for the 4 individual campaigns. As we can see, campaign
3358 has the highest expected margin as well as the highest
risk, while campaign 3386 is the most stable one but with
the lowest expected margin. The best empirical portfolio
selection is shown as the vertical dashed line in Figure 4(f),
where 94.9% auction volume is allocated to campaign 3358
and 4.1% is allocated to campaign 3427. However, if the
meta-bidder is more risk-averse, the other two campaigns can
be included in order to further reduce the standard
deviation. The parameter $\alpha$ in Eq. (32) provides a flexible way
to adjust such a risk and return trade-off.
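
To make the role of the risk parameter concrete, consider a hypothetical portfolio of two campaigns with uncorrelated margins, i.e., expected margins $\mu_1, \mu_2$ and covariance $\Sigma = \mathrm{diag}(\sigma_1^2, \sigma_2^2)$. This special case is our own illustration and not one of the reported experiments. Writing $v_1 = v$ and $v_2 = 1 - v$, the objective of Eq. (32) and its stationary point are:

```latex
\begin{align*}
f(v) &= v\mu_1 + (1-v)\mu_2
        - \alpha\left(v^2\sigma_1^2 + (1-v)^2\sigma_2^2\right),\\
f'(v) &= \mu_1 - \mu_2
        - 2\alpha\left(v\sigma_1^2 - (1-v)\sigma_2^2\right) = 0
\;\Longrightarrow\;
v^\ast = \frac{(\mu_1-\mu_2)/(2\alpha) + \sigma_2^2}{\sigma_1^2+\sigma_2^2},
\end{align*}
```

with $v^\ast$ clipped to $[0, 1]$. If $\mu_1 > \mu_2$, a small enough $\alpha$ pushes $v^\ast$ to 1, which recovers the greedy selection, while a large $\alpha$ drives $v^\ast$ towards $\sigma_2^2 / (\sigma_1^2 + \sigma_2^2)$, the minimum-variance split.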


4.4 Dynamic Multiple Campaign Arbitrage 

In practice, as the market competition and the user
behaviour change over time, the meta-bidder should
dynamically change its bidding strategy and campaign
selection. In this part of the experiment, we test the capability of
our proposed sam2 bidding strategy with dynamic campaign
portfolio selection over a 72-hour test period. The
arbitrage bidding function and campaign selection probability








































Figure 5: Dynamic multi-campaign arbitrage net profit distribution with different update frequency. Panels: (a) easy payoff setting; (b) hard payoff setting (x-axis: hour).

are updated periodically, and we refer to the interval between
two updates as one round. Specifically, at the beginning of
each round, we re-train the arbitrage bidding function and
campaign selection probability using Algorithm 1 based on
the bidding data collected from the previous round. The question
is how frequent the updates should be. If the round period is
too long, it is difficult for
the meta-bidder to catch the transient statistical arbitrage
opportunities; if the round period is too short, the training
data could be sparse and the model might overfit the data.
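
The round structure can be sketched as a schedule of (training window, deployment window) pairs. The helper below is hypothetical; in particular, representing windows as half-open hour ranges and letting the first round train on the `period` hours before the test starts (negative hours) are our assumptions.

```python
def round_schedule(total_hours, period):
    """Return (train_window, deploy_window) pairs of half-open hour ranges.

    Each round deploys a model re-trained on the previous round's data.
    """
    rounds = []
    for start in range(0, total_hours, period):
        end = min(start + period, total_hours)
        rounds.append(((start - period, start), (start, end)))
    return rounds


dynamic = round_schedule(72, 6)    # 12 rounds, re-trained every 6 hours
static = round_schedule(72, 72)    # a single round, i.e., only one update
```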

We test the dynamic multiple campaign arbitrage on 5
portfolios, each of which consists of 4 campaigns with the
data logged within the same period. For each test
campaign portfolio, we try different update frequencies as
well as different risk-averse $\alpha$’s. The box plots [26] of the
arbitrage net profit distribution with different update
frequencies under two payoff settings are shown in Figure 5.
From the results we observe that (i) the positive net profit
values over all cases demonstrate the capability of sam2 to
make dynamic arbitrages. (ii) In both payoff settings, the
dynamic SAM (period no more than 24 hours) performs much
better than the static SAM (period equal to 72
hours, i.e., only one update), which indicates the importance
of dynamically re-training the models to catch the latest
market situation. (iii) Among the different frequencies of
dynamic updating, updating every 6 hours leads to the
highest arbitrage net profit. We believe this is a trade-off point
between the abundance and recency of the training data.
Note that the optimal update frequency may be different
for other campaigns or different training settings.

In addition, Figure 6 presents a case study of the 72-hour
dynamic 4-campaign arbitrage with the model updated
every 6 hours. In each round, the calculated campaign
selection probability (i.e., the volume allocation) from portfolio
optimisation, the estimated net profit margin of each
campaign, and the empirical net profit and cost are depicted. We
observe that the estimated margin for each campaign varies
over time, which results in the change of campaign volume
allocation over time. The empirical profit shows the
same trend as the estimated campaign margin, which to
some extent highlights the effectiveness of the margin
estimation in our model. Moreover, the cost in each round (i.e.,
6 hours) varies and is not necessarily equal to the average budget
allocated for each round. It is possible that if the market is
too competitive to make arbitrage profit, the resulting cost
and profit could both be very low.

4.5 Online Test 

Our SAM algorithm has been deployed and tested in a live
environment provided by BigTree DSP. The model training
follows the scheme in Section 4.3. Specifically, with
Algorithm 1, we obtain the empirically optimal sam2 bidding



Figure 6: A case study of dynamic multi-campaign
arbitrage performance and the corresponding margin
estimation and volume allocation.

Figure 7: Online performance on BigTree DSP.


function $b(\theta, r)$ and campaign selection probability v for the
meta-bidder based on the 3-campaign training data described
in Section 4.1.1, where the hyperparameter $\alpha$ in Eq. (32) is
set to 0.1. As a control baseline, we deployed another
meta-bidder with the basic linear bidding function [28, 21] and
the internal auction-based campaign selection scheme [32],
denoted as base. During the online A/B test, every bid request
received from the router of BigTree DSP is randomly
assigned to one of the two meta-bidders, which returns the
bid response, including the selected campaign ad and the bid
price, back to the ad exchange for auction. The online test
was conducted over 23 hours between 13 and 14 Feb. 2015
with a $60 budget for each meta-bidder.
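
The random traffic split of the A/B test can be sketched as follows. The equal split, the function name and the fixed seed are assumptions made for illustration; the text only states that each bid request is randomly assigned to one of the two meta-bidders.

```python
import random


def assign_bidder(rng, bidders=("sam", "base")):
    """Pick one meta-bidder uniformly at random for an incoming bid request."""
    return rng.choice(bidders)


# Simulate routing 10,000 hypothetical bid requests.
rng = random.Random(2015)
counts = {"sam": 0, "base": 0}
for _ in range(10000):
    counts[assign_bidder(rng)] += 1
```

Under a uniform split each meta-bidder receives roughly half of the requests, so both face a comparable sample of the market.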

Figure 7 presents the overall online performance of sam
and the baseline algorithm base. The online results on the
commercial DSP verify the effectiveness of our algorithm in
a real commercial setting: sam leads to $30.6 arbitrage net
profit with $60 budget, which is a 51.1% margin and a 31.8%
improvement over the base bidder setting. An interesting
observation is that in spite of the higher CPM, sam brings
lower eCPA than base, which ultimately leads to higher
arbitrage net profit. This suggests that although the market price
and arbitrage margin differ across the campaigns, our
SAM algorithm is able to identify and
target the cases that have higher arbitrage margin among
those high-value impressions (reflected by their high CPM).

















5. CONCLUSIONS 


In this paper, we conducted the first study on statistical
arbitrage mining in RTB display advertising. We proposed
a joint optimisation framework to maximise the expected
arbitrage net profit with budget and risk constraints, which
is then solved in an EM fashion. In the E-step the bid
volume is reallocated according to the individual campaign’s
estimated risk and return, while in the M-step the arbitrage
bidding function is optimised to maximise the expected
arbitrage net profit with the campaign volume allocation. Aside
from the theoretical insights, the offline and online
large-scale experiments with real-world data demonstrated the
effectiveness of our proposed solution in exploiting arbitrage
in various model settings and market environments. We
believe this would open up a whole new set of research
questions at the intersection of financial methods, such as
high-frequency trading [15] and risk management [24, 12], and data
mining methodologies for display advertising and beyond.
In future work, we plan to further improve the dynamic
nature of the SAM model and extend it to mine arbitrage in
other domains such as cloud computing and e-commerce.